Optimizing Spoken Dialogue Management from Data Corpora with Fitted Value Iteration
Abstract
In recent years, machine learning approaches have been proposed for optimizing dialogue management in spoken dialogue systems. It is customary to cast the dialogue management problem as a Markov Decision Process (MDP) and to find the associated optimal policy using Reinforcement Learning (RL) algorithms. Yet the dialogue state space is usually very large (even infinite), and standard RL algorithms fail to handle it. In this paper we explore the use of a generalization framework for dialogue management: a particular fitted value iteration algorithm, namely fitted-Q iteration. We show that fitted-Q, when applied to continuous state space dialogue management problems, generalizes well and makes efficient use of samples to learn an approximation of the optimal state-action value function. Our experimental results show that fitted-Q performs significantly better than the hand-coded policy and relatively better than the policy learned using least-squares policy iteration (LSPI), another generalization algorithm.
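As a rough illustration of the fitted-Q idea mentioned in the abstract, the sketch below runs batch fitted-Q iteration with a linear least-squares regressor on a fixed set of transitions. It is not the authors' implementation: the feature map, the toy "dialogue" (a scalar confidence score as state, with hypothetical actions 0 = ask again and 1 = close the dialogue), the rewards, and all function names are assumptions made only for this example.

```python
import numpy as np

def features(state, action, n_actions):
    """Hypothetical feature map: one [1, state] block per action."""
    phi = np.zeros(2 * n_actions)
    phi[2 * action: 2 * action + 2] = [1.0, state]
    return phi

def fitted_q_iteration(transitions, n_actions, gamma=0.9, n_iters=50, ridge=1e-6):
    """Batch fitted-Q with a linear regressor.

    transitions: list of (state, action, reward, next_state, done) tuples.
    Returns the weight vector theta such that Q(s, a) ~ features(s, a) @ theta.
    """
    phi = np.array([features(s, a, n_actions) for s, a, _, _, _ in transitions])
    rewards = np.array([r for _, _, r, _, _ in transitions])
    done = np.array([d for _, _, _, _, d in transitions], dtype=float)
    # Next-state features for every action: (n_samples, n_actions, n_features).
    phi_next = np.array([[features(s2, b, n_actions) for b in range(n_actions)]
                         for _, _, _, s2, _ in transitions])
    # Regularized normal-equation matrix, reused at every iteration.
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    theta = np.zeros(phi.shape[1])
    for _ in range(n_iters):
        # Bellman targets built from the current Q estimate.
        q_next = (phi_next @ theta).max(axis=1)
        targets = rewards + gamma * (1.0 - done) * q_next
        # Regression step: fit Q to the targets by least squares.
        theta = np.linalg.solve(gram, phi.T @ targets)
    return theta

def greedy_action(theta, state, n_actions):
    """Action with the highest estimated Q-value in the given state."""
    q_values = [features(state, a, n_actions) @ theta for a in range(n_actions)]
    return int(np.argmax(q_values))

# Toy batch (assumed, not from the paper): asking again costs 0.1 and raises
# the confidence score by 0.3; closing ends the dialogue with reward equal to
# the current confidence.
batch = []
for s in np.linspace(0.0, 1.0, 11):
    batch.append((s, 0, -0.1, min(1.0, s + 0.3), False))
    batch.append((s, 1, s, s, True))

theta = fitted_q_iteration(batch, n_actions=2)
policy_low = greedy_action(theta, 0.0, 2)   # action chosen at low confidence
policy_high = greedy_action(theta, 1.0, 2)  # action chosen at high confidence
```

The same loop works with any supervised regressor in place of the least-squares solve; the linear model is used here only to keep the sketch short.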
Related resources
Optimizing spoken dialogue management with fitted value iteration
In recent years machine learning approaches have been proposed for dialogue management optimization in spoken dialogue systems. It is customary to cast the dialogue management problem into a Markov Decision Process (MDP) and to find the associated optimal policy using Reinforcement Learning (RL) algorithms. Yet, the dialogue state space is usually very large (even infinite) and standard RL algo...
On-Line Learning of a Persian Spoken Dialogue System Using Real Training Data
The first spoken dialogue system developed for the Persian language is introduced. This is a ticket reservation system with Persian ASR and NLU modules. The focus of the paper is on learning the dialogue management module. In this work, real on-line training data are used during the learning process. For on-line learning, the effect of the variations of discount factor (g) on the learning speed...
A Portuguese spoken and multi
This paper presents an overview of the Portuguese spoken and multimodal dialog corpora collected in the context of the FASiL (Flexible and Adaptive Spoken Language and Multi-Modal Interfaces) project. The project developed a Virtual Personal Assistant application in the Personal Information Management domain, exploiting state-of-the-art speech and multi-modal technology. The FASiL corpora...
Optimising Turn-Taking Strategies With Reinforcement Learning
In this paper, reinforcement learning (RL) is used to learn an efficient turn-taking management model in a simulated slot-filling task with the objective of minimising the dialogue duration and maximising the task completion ratio. Turn-taking decisions are handled in a separate new module, the Scheduler. Unlike most dialogue systems, a dialogue turn is split into microturns and the Scheduler ma...